{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Getting Started with Selene\n",
    "\n",
    "This tutorial explores the core components of Selene, and should teach you everything you need to know to train a simple model on biological sequence data. \n",
    "Before starting this tutorial, you need to install Selene.\n",
    "Instructions for installation are available [here](https://selene.flatironinstitute.org/overview/installation.html).\n",
    "Lastly, if you are not familiar with neural networks, we recommend reading through this [introductory PyTorch tutorial on neural networks](https://pytorch.org/tutorials/beginner/blitz/neural_networks_tutorial.html).\n",
    "In the simplest case, we train a neural network as follows:\n",
    "\n",
    "1. Construct our neural network, which should be a [`torch.nn.Module`](https://pytorch.org/docs/stable/nn .html#torch.nn.Module) object\n",
    "2. Load the training data and divide it into training and validation sets\n",
    "3. Iterate over the training set\n",
    "4. Compute and backpropagate the training loss after each iteration\n",
    "5. Save the model weights at specified intervals\n",
    "6. Compute and report the loss on the validation set at specified intervals\n",
    "7. Compute and report the loss on the validation set after training is complete\n",
    "\n",
    "In this case, much of our work is already done for us.\n",
    "In fact, we do not actually need to write any code besides our model and a configuration file.\n",
    "\n",
    "## Dependencies\n",
    "\n",
    "In addition to the `selene-sdk` library, we also use [`htslib`](https://www.htslib.org) for data processing. Specifically, we use the tools [**`tabix`**](https://www.htslib.org/doc/tabix.html) and [**`bgzip`**](https://www.htslib.org/doc/bgzip.html) from `htslib`. \n",
    "\n",
    "If you eventually want to use Selene on new data, please download `htslib`. We recommend using `selene-sdk` in a [conda environment](https://conda.io/docs/user-guide/tasks/manage-environments.html) and installing `htslib` (`conda install htslib -c bioconda`) to that same environment. \n",
    "\n",
    "However, if you take this approach, you can only use `tabix` and `bgzip` on the command-line when the conda environment is activated. You can instead download and build the package on your machine by following the instructions on the [`htslib` website](http://www.htslib.org/download/). \n",
    "\n",
    "## Download and format the data\n",
    "\n",
    "First, we need to download the data. (The \"Shortcut\" below contains all the formatted data in a `.tar.gz`.)\n",
    "\n",
    "In this tutorial, we will go through a single-feature example with [data](http://hgdownload.cse.ucsc.edu/goldenpath/hg19/encodeDCC/wgEncodeAwgTfbsUniform/) from the ENCODE Uniform TFBS composite track. These are the transcription factor datasets that were used to train [Zhou and Troyanskaya's (2015)](https://doi.org/10.1038/nmeth.3547) DeepSEA model.\n",
    "\n",
    "We can download the measurements for transcription factor CTCF in cell type GM12878 by running\n",
    "\n",
    "```sh\n",
    "wget http://hgdownload.cse.ucsc.edu/goldenpath/hg19/encodeDCC/wgEncodeAwgTfbsUniform/wgEncodeAwgTfbsUtaGm12878CtcfUniPk.narrowPeak.gz\n",
    "```\n",
    "\n",
    "and format the data with\n",
    "\n",
    "```sh\n",
    "bgzip -d wgEncodeAwgTfbsUtaGm12878CtcfUniPk.narrowPeak.gz\n",
    "\n",
    "cut -f 1-3 wgEncodeAwgTfbsUtaGm12878CtcfUniPk.narrowPeak > GM12878_CTCF.bed\n",
    "\n",
    "sed -i \"s/$/\\tGM12878|CTCF|None/\" GM12878_CTCF.bed\n",
    "\n",
    "sort -k1V -k2n -k3n GM12878_CTCF.bed > sorted_GM12878_CTCF.bed\n",
    "```\n",
    "\n",
    "The formatted BED file should contain 4 columns, in order: chromosome, start, end, feature. We do not support strand-specific data at this time (it will be added in a later version of Selene).\n",
    "\n",
    "In this example, we will use the `IntervalsSampler` class for partitioning and sampling the data. The intervals sampler requires that we pass in an intervals file with 3 columns: chrom, start, end. This intervals file determines where in the genome we sample our data. We have provided an intervals file for you (`deepsea_TF_intervals.txt`) with the regions in the original DeepSEA dataset that contained at least 1 transcription factor (TF). This is included in the Zenodo record below--we do not store it in the repository. We will refer to this file as \"DeepSEA TF regions\" from now on.\n",
    "\n",
    "It also requires that we tabix-index the dataset BED file for fast querying of targets in genomic regions.\n",
    "\n",
    "```sh\n",
    "bgzip -c sorted_GM12878_CTCF.bed > sorted_GM12878_CTCF.bed.gz\n",
    "\n",
    "tabix -p bed sorted_GM12878_CTCF.bed.gz\n",
    "```\n",
    "\n",
    "Selene provides sampler classes to partition your dataset into training/testing/validation sets and will draw examples from the appropriate partitions during the training/evaluation process.\n",
    "\n",
    "These sampler classes require that you have a file containing the distinct genomic features that the model predicts. Note that when we refer to a model's \"features\", we are referring to the genomic features that it predicts (i.e. they are the same as classes, labels, or targets that a deep learning model predicts). \n",
    "\n",
    "```sh\n",
    "cut -f 4 sorted_GM12878_CTCF.bed | sort -u > distinct_features.txt\n",
    "```\n",
    "\n",
    "Finally, we must download the hg19 FASTA file:\n",
    "\n",
    "```sh\n",
    "wget https://www.encodeproject.org/files/male.hg19/@@download/male.hg19.fasta.gz\n",
    "\n",
    "bgzip -d male.hg19.fasta.gz\n",
    "```\n",
    "\n",
    "### SHORTCUT: download all formatted data from Zenodo record\n",
    "\n",
    "Download the compressed data from here:\n",
    "\n",
    "```sh\n",
    "wget https://zenodo.org/record/1443558/files/selene_quickstart.tar.gz\n",
    "```\n",
    "\n",
    "Extract it and `mv` all files from the extracted directory `selene_quickstart_tutorial` to the current directory:\n",
    "\n",
    "```sh\n",
    "tar -xzvf selene_quickstart.tar.gz\n",
    "mv selene_quickstart_tutorial/* .\n",
    "```\n",
    "\n",
    "## Command line arguments\n",
    "\n",
    "At the end of this tutorial, we run Selene using library functions. These are the same functions used in Selene's [command-line interface (CLI)](https://github.com/FunctionLab/selene/blob/master/selene_cli.py), a file that users can copy and use after installing Selene or cloning and building the local version. \n",
    "\n",
    "If you use the CLI script, please install `docopt` (`conda install docopt`), which the CLI relies on to parse input arguments.\n",
    "\n",
    "Selene uses a limited number (two to be precise) of command line arguments.\n",
    "The first of these is the positional parameter for the configuration file, which we will discuss in more detail in the following section.\n",
    "The second argument is the optional named argument for the learning rate, specified with `--lr`.\n",
    "The learning rate only needs to be specified when Selene is training a model, and is ignored in all other circumstances.\n",
    "\n",
    "If you install Selene via conda or pip (we recommend conda), only download and use the CLI script--there is no need to clone the entire repository. You should only do so if you plan on developing or modifying Selene yourself.\n",
    "\n",
    "## Configuration file syntax\n",
    "\n",
    "The configuration file is a [YAML file](https://en.wikipedia.org/wiki/YAML) that specifies the majority of the runtime parameters for Selene.\n",
    "In general, a YAML file with keys `key1` and `key2` taking values `val1` and `val2` would look like such:\n",
    "\n",
    "```YAML\n",
    "---\n",
    "key1: val1\n",
    "key2: val2\n",
    "...\n",
    "```\n",
    "\n",
    "For training a new network, there are a few keys that we must include in this YAML file, which we will discuss later.\n",
    "\n",
    "The following sections explain each of these parameters in some detail.\n",
    "However, we first need to discuss the syntax for our configuration file.\n",
    "We discuss each of the argument types for configuration files below.\n",
    "\n",
    "### Literal arguments\n",
    "\n",
    "The simplest configuration arguments are literals.\n",
    "To specify a learning rate of `0.01`, a random seed for reproducibility, and an output directory, we would include the following lines in our configuration file:\n",
    "```YAML\n",
    "lr: 0.01\n",
    "random_seed: 123\n",
    "output_dir: path/to/output/dir\n",
    "```\n",
    "\n",
    "Note that at the top-level, we do not separate these arguments with commas. \n",
    "\n",
    "### List arguments\n",
    "\n",
    "After literals, lists arguments like `ops` are the next simplest type of configuration parameter.\n",
    "Syntactically, list arguments are very similar to the python lists that they represent.\n",
    "For instance, to specify `ops` as the Python list below:\n",
    "```python\n",
    "ops = [\"train\", \"evaluate\"]\n",
    "```\n",
    "we would write the following line in our configuration file:\n",
    "```YAML\n",
    "ops: [train, evaluate]\n",
    "```\n",
    "\n",
    "### Dictionary arguments\n",
    "\n",
    "The next type of argument we need is a dictionary.\n",
    "Like lists, dictionaries in the configuration file are very similar to their Python equivalents.\n",
    "For instance, if the `model` configuration were written as a dictionary in Python, it might look something like the following:\n",
    "```python\n",
    "model = {\"path\": \"/absolute/path/to/deeperdeepsea.py\",\n",
    "         \"class\": \"DeeperDeepSEA\",\n",
    "         \"class_args\": {\n",
    "             \"sequence_length\": 1000,\n",
    "             \"n_targets\": 1\n",
    "         },\n",
    "         \"non_strand_specific\": \"mean\"\n",
    "        }\n",
    "```\n",
    "Now, to write this in the configuration file, we simply include the following lines:\n",
    "```YAML\n",
    "model: {\n",
    "    path: /absolute/path/to/deeperdeepsea.py,\n",
    "    class: DeeperDeepSEA,\n",
    "    class_args: {\n",
    "        sequence_length: 1000,\n",
    "        n_targets: 1\n",
    "    },\n",
    "    non_strand_specific: mean\n",
    "}\n",
    "```\n",
    "\n",
    "### Function arguments \n",
    "\n",
    "In addition to the types we've just discussed, Selene's configuration accept python function calls.\n",
    "For instance, let's say we want to specify the value of the `features` argument for `train_model`, which takes a list of strings and specifies the names of the values we are predicting with our model.\n",
    "One option would be to write the list of strings into the configuration file, but this might take a long time if this list is very long.\n",
    "If we were using Python, we would just read the list of feature names the following:\n",
    "```python\n",
    "import selene_sdk\n",
    "features = selene_sdk.utils.load_features_list(input_path=\"distinct_features.txt\")\n",
    "```\n",
    "Fortunately, we can use function the function call arguments to include this in our configuration file.\n",
    "Specifically, we would write the following in our configuration file:\n",
    "```YAML\n",
    "features: !obj.selene_sdk.utils.load_features_list {\n",
    "    input_path: <path>/distinct_features.txt\n",
    "}\n",
    "```\n",
    "\n",
    "## Training a model and analyzing sequences with it\n",
    "\n",
    "To train or analyze sequences with a model, we first specify the configuration file and then we execute `selene` from the command line.\n",
    "The first section provides an overview of all the requirements for training a model with Selene.\n",
    "The second section covers the arguments used to evaluate sequences with a trained model.\n",
    "We recommend opening the included `simple_train.yml` configuration file and following along in them while reading through these sections.\n",
    "\n",
    "### Configuration file arguments for training\n",
    "Before running Selene from the command line, we need to specify its runtime parameters in a configuration file.\n",
    "In this example, we need to include the following:\n",
    "\n",
    "| key                 | definition |\n",
    "|---------------------|-----------------------------------------------------------------------------------------------------|\n",
    "| ops                 | list of operations to execute with Selene |\n",
    "| model               | dict containing the configuration parameters for the model we intend to train.\n",
    "| sampler             | a subclass of selene_sdk.samplers.Sampler |\n",
    "| train_model         | a subclass of selene_sdk.TrainModel |\n",
    "| lr                  | a floating point value for the learning rate, if we do not want to specify it in the command line arguments |\n",
    "| random_seed         | an int specifying the random seed for reproducibility |\n",
    "| output_dir          | the str path to the directory to which Selene saves outputs |\n",
    "| create_subdirectory | bool specifying whether to create an output subdirectory in `output_dir` or not (the dirname is the date/time Selene was run) |\n",
    "\n",
    "#### ops\n",
    "\n",
    "Selene currently supports the `train` and `analyze` operations, and allows chaining of operations by simply adding them to the `ops` list in the configuration file.\n",
    "For instance, to train a model and then use it to analyze some data, you would include the following line in the configuration file:\n",
    "```YAML\n",
    "ops: [train, analyze]\n",
    "```\n",
    "(Note that adding the `analyze` operation means you would need to include some additional keys in your configuration.)\n",
    "\n",
    "To only train a model, we would just write the following:\n",
    "```YAML\n",
    "ops: [train]\n",
    "```\n",
    "\n",
    "Each operation comes with its own requirements about what keys are in the configuration file. These keys contain the class configurations required to run a given operation. We have examples of configuration files [here](https://github.com/FunctionLab/selene/tree/master/config_examples) for the main operations that you can run with Selene.\n",
    "\n",
    "In this case, we specify \n",
    "```YAML\n",
    "ops: [train, evaluate]\n",
    "```\n",
    "in our configuration file because we would like to automatically evaluate our model on the test dataset when training completes.\n",
    "\n",
    "#### model\n",
    "\n",
    "In this tutorial, we will use an example neural network slightly modified from [DeepSEA](http://deepsea.princeton.edu), which models chromatin properties of sequences in the non-coding genome. We use a deeper architecture (doubled the number of convolutional layers) because it is better able to discern the sequence patterns associated with the genomic feature we are predicting, which has a very small number of examples (~50,000 regions in `sorted_GM12878_CTCF.bed` that are around ~200bp in length). \n",
    "\n",
    "The class for this model, `DeeperDeepSEA`, is specified in the [`deeperdeepsea.py`](https://github.com/FunctionLab/selene/tree/master/tutorials/getting_started_with_selene/deeperdeepsea.py) file from earlier.\n",
    "The model should follow all of the [normal rules for specifying a `torch.nn.Module`](https://pytorch.org/tutorials/beginner/examples_nn/two_layer_net_module.html), with two exceptions.\n",
    "First, the file with the model class should include a method called `criterion` that returns the object to use for the [PyTorch loss function](https://pytorch.org/docs/master/nn.html#loss-functions).\n",
    "In `deepsea.py`, this is defined as follows:\n",
    "```python\n",
    "def criterion():\n",
    "    return torch.nn.BCELoss()\n",
    "```\n",
    "Second, we must define a method called `get_optimizer` that takes a learning rate, and returns the [optimization function](https://pytorch.org/docs/master/optim.html) and its parameters.\n",
    "The return value should be a 2-tuple, where the first element is the optimizer class, and the second element is a `dict` containing the keyword arguments to use when constructing the optimizer.\n",
    "In `deepsea.py`, this is specified as follows:\n",
    "```python\n",
    "def get_optimizer(lr):\n",
    "    return (torch.optim.SGD, {\"lr\": lr, \"weight_decay\": 1e-6, \"momentum\": 0.9})\n",
    "```\n",
    "Note that, to allow specifying the learning rate at the command line, you should include the passed `lr` argument in the `dict` of keyword arguments.\n",
    "\n",
    "#### sampler\n",
    "\n",
    "The `sampler` argument specifies how Selene will sample its training data.\n",
    "The value for `sampler` should be a function-type argument, and the function needed to construct an object that is a subclass of `selene_sdk.samplers.Sampler`. \n",
    "The specific arguments for the sampler's construction will vary by class, so it is important to check the class definitions and documentation when specifying them.\n",
    "For the example, we will use the following configuration for the `sampler`:\n",
    "```YAML\n",
    "sampler: !obj:selene_sdk.samplers.IntervalsSampler {\n",
    "    reference_sequence: !obj:selene_sdk.sequences.Genome {\n",
    "        input_path: male.hg19.fasta\n",
    "    },\n",
    "    features: !obj:selene_sdk.utils.load_features_list {\n",
    "        input_path: distinct_features.txt\n",
    "    },\n",
    "    target_path: sorted_GM12878_CTCF.bed.gz,\n",
    "    intervals_path: deepsea_TF_intervals.txt,\n",
    "    seed: 127,\n",
    "    sample_negative: True,\n",
    "    sequence_length: 1000,\n",
    "    center_bin_to_predict: 200,\n",
    "    test_holdout: [chr8, chr9],\n",
    "    validation_holdout: [chr6, chr7],\n",
    "    feature_thresholds: 0.5,\n",
    "    mode: train,\n",
    "    save_datasets: [test]\n",
    "}\n",
    "```\n",
    "\n",
    "The intervals sampler samples from regions specified in `intervals_path`. In this case, we provide you with the regions in the DeepSEA dataset that contained at least 1 transcription factor. \n",
    "\n",
    "For your own dataset, you might do something similar. Alternatively, you could sample uniformly from all regions in the genome. If you want to same uniformly across the genome, please look at the documentation for [`selene_sdk.samplers.RandomPositionsSampler`](http://selene.flatironinstitute.org/samplers.html).\n",
    "\n",
    "#### train_model\n",
    "The `train_model` argument is responsible to specifying many of the parameters for `selene_sdk.TrainModel`.\n",
    "The following parameters for `train_model` are automatically generated, and should not be specified in the configuration file:\n",
    "\n",
    "|                |\n",
    "|----------------|\n",
    "| model          |\n",
    "|data_sampler    |\n",
    "|loss_criterion  |\n",
    "|optimizer_class |\n",
    "|optimizer_kwargs|\n",
    "\n",
    "With this in mind, we write the following in our configuration file:\n",
    "```YAML\n",
    "train_model: !obj:selene_sdk.TrainModel {\n",
    "    batch_size: 64,\n",
    "    # typically the number of steps is much higher\n",
    "    max_steps: 16000,  \n",
    "    # the number of mini-batches the model should sample before reporting performance\n",
    "    report_stats_every_n_steps: 2000,\n",
    "    n_validation_samples: 32000,\n",
    "    n_test_samples: 120000,\n",
    "    cpu_n_threads: 32,\n",
    "    use_cuda: False,\n",
    "    data_parallel: False\n",
    "}\n",
    "```\n",
    "\n",
    "#### other arguments\n",
    "\n",
    "There are 5 additional optional arguments we can use when training models: `lr`, `random_seed`, `output_dir`, `create_subdirectory`, and `load_test_set`.\n",
    "If you do not want to specify the learning rate in the command line arguments, you can specify it in the configuration file.\n",
    "However, note that Selene will throw an exception and crash if training is specified in the operations `op` and `lr` is not included in the configuration file or in the command line arguments.\n",
    "If we want to specify it in the configuration file, we can include the following lines:\n",
    "```YAML\n",
    "lr: 0.01\n",
    "```\n",
    "\n",
    "It is recommended that you specify a random seed for `torch`. This is helpful for reproducibility. \n",
    "```YAML\n",
    "random_seed: 1447\n",
    "```\n",
    "Please include the path to an output directory in the configuration file. If the directory does not exist, it is automatically created for you. \n",
    "```YAML\n",
    "output_dir: ./training_outputs\n",
    "```\n",
    "\n",
    "If `create_subdirectory` is True, Selene will create a subdirectory with the current timestamp as the name within `output_directory`. This is so that you can save multiple runs in a single directory:\n",
    "```YAML\n",
    "create_subdirectory: True\n",
    "```\n",
    "\n",
    "If `load_test_set` is True, Selene will create the test dataset at the start of training. This is useful if you are not sure if your job will finish running in time and you want to save the test dataset generated by the sampler for use in a different job or to compare multiple models with the same data. We have set `load_test_set` to `False` because it is not necessary to do so in this situation. \n",
    "```YAML\n",
    "load_test_set: False\n",
    "```\n",
    "\n",
    "### More information about configuration files and the Selene CLI \n",
    "Please review the page [here](http://selene.flatironinstitute.org/overview/cli.html). There are other parameters that can be used for training--and  many other CLI-supported operations in Selene--that are documented and explained in much greater detail. For example, Selene supports [training regression models](https://github.com/FunctionLab/selene/blob/master/tutorials/regression_mpra_example/regression_mpra_example.ipynb); however, this would require you to specify in the appropriate metric (e.g. coefficient of determination) to be reported during model training and evaluation. You can do this through the configuration for `train_model` by adjusting the `metrics` parameter. We do not discuss (or use) this parameter in our current tutorial because, by default, Selene will report ROC AUC and average precision scores, which are metrics are appropriate for our present use case. \n",
    "\n",
    "### Running it\n",
    "\n",
    "Now, it only takes 2 methods to run training in Selene:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "%matplotlib inline\n",
    "\n",
    "from selene_sdk.utils import load_path\n",
    "from selene_sdk.utils import parse_configs_and_run"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "First, we load the configuration file as a dictionary. We provide `./simple_train.yml` for you, but you must review this file and the spots labeled \"`# TODO`\" before running the subsequent cells. For example, you need to replace the path in `model[\"path\"]` with the absolute path to the model architecture file."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "configs = load_path(\"./simple_train.yml\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Second, execute the operations that were specified in `ops`. This configuration parsing function will look for the classes/parameters corresponding to the operations \"train\" and \"evaluate\", instantiate those objects, and run the necessary methods."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Outputs and logs saved to ./training_outputs\n",
      "2018-12-09 17:04:24,361 - Creating validation dataset.\n",
      "2018-12-09 17:04:43,859 - 19.496248722076416 s to load 32000 validation examples (500 validation batches) to evaluate after each training step.\n",
      "2018-12-09 17:19:38,640 - [STEP 1000] average number of steps per second: 1.1\n",
      "2018-12-09 17:23:36,334 - validation roc_auc: 0.893867336973852\n",
      "2018-12-09 17:23:36,337 - validation average_precision: 0.3102001612581879\n",
      "2018-12-09 17:23:36,997 - training loss: 0.1302807629108429\n",
      "2018-12-09 17:23:36,998 - validation loss: 0.07270634551346301\n",
      "2018-12-09 17:51:52,927 - [STEP 2000] average number of steps per second: 0.6\n",
      "2018-12-09 17:55:50,346 - validation roc_auc: 0.9512494270169005\n",
      "2018-12-09 17:55:50,349 - validation average_precision: 0.40000877772381327\n",
      "2018-12-09 17:55:50,990 - training loss: 0.07032211124897003\n",
      "2018-12-09 17:55:50,992 - validation loss: 0.05635528636910021\n",
      "2018-12-09 18:24:06,415 - [STEP 3000] average number of steps per second: 0.6\n",
      "2018-12-09 18:28:03,913 - validation roc_auc: 0.9552986985809948\n",
      "2018-12-09 18:28:03,916 - validation average_precision: 0.45141115523347036\n",
      "2018-12-09 18:28:04,553 - training loss: 0.046097103506326675\n",
      "2018-12-09 18:28:04,554 - validation loss: 0.05327987883985043\n",
      "2018-12-09 18:56:18,870 - [STEP 4000] average number of steps per second: 0.6\n",
      "2018-12-09 19:00:16,151 - validation roc_auc: 0.9639399314413266\n",
      "2018-12-09 19:00:16,154 - validation average_precision: 0.44877787773688\n",
      "2018-12-09 19:00:16,857 - training loss: 0.05495147779583931\n",
      "2018-12-09 19:00:16,858 - validation loss: 0.05077911231201142\n",
      "2018-12-09 19:28:32,337 - [STEP 5000] average number of steps per second: 0.6\n",
      "2018-12-09 19:32:30,002 - validation roc_auc: 0.9710667699697066\n",
      "2018-12-09 19:32:30,005 - validation average_precision: 0.4629566566643726\n",
      "2018-12-09 19:32:30,621 - training loss: 0.025935456156730652\n",
      "2018-12-09 19:32:30,622 - validation loss: 0.04881749495817348\n",
      "2018-12-09 20:00:44,177 - [STEP 6000] average number of steps per second: 0.6\n",
      "2018-12-09 20:04:41,542 - validation roc_auc: 0.971648945711097\n",
      "2018-12-09 20:04:41,545 - validation average_precision: 0.4872023052389895\n",
      "2018-12-09 20:04:42,129 - training loss: 0.04498403146862984\n",
      "2018-12-09 20:04:42,130 - validation loss: 0.04800665914593265\n",
      "2018-12-09 20:32:57,976 - [STEP 7000] average number of steps per second: 0.6\n",
      "2018-12-09 20:36:55,343 - validation roc_auc: 0.9730275430484694\n",
      "2018-12-09 20:36:55,346 - validation average_precision: 0.504173195672806\n",
      "2018-12-09 20:36:55,918 - training loss: 0.046578191220760345\n",
      "2018-12-09 20:36:55,920 - validation loss: 0.04690026090014726\n",
      "2018-12-09 21:05:09,697 - Creating test dataset.\n",
      "2018-12-09 21:06:22,323 - 72.62393712997437 s to load 120000 test examples (1875 test batches) to evaluate after all training steps.\n",
      "2018-12-09 21:19:49,463 - test roc_auc: 0.9709206961003511\n",
      "2018-12-09 21:19:49,465 - test average_precision: 0.509259649893447\n"
     ]
    },
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAYoAAAEWCAYAAAB42tAoAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDIuMi4zLCBodHRwOi8vbWF0cGxvdGxpYi5vcmcvIxREBQAAIABJREFUeJzt3XucXVV99/HPd2YyyeRCIBeSkATCJSA3uTShKG2FYhWoglKqUKTFUqk8pRRRW1p8WUv19fig1kcqFmlLqVRu3mhq40MppWItAYZigBAICRCTkEDIjcncz5zf88fek5wMM2f2TObsM2fm+3695pWz77/Zr8z6nbXW3mspIjAzMxtIXbUDMDOz0c2JwszMynKiMDOzspwozMysLCcKMzMry4nCzMzKcqIwM7OynCisoiS9Iqld0u6Sn0P285xnSto4UjFmvOYdkrrS+LdLelDS2/rss0DStyVtk9Qq6XFJ7+uzjyRdI+nZdJ+Nkr4j6cQ8fx+zoXCisDy8PyKmlvy8Ws1gJDUM89CbImIqMB/YBPx9yTlnAP8FdAHHA7OArwJ3Sbqo5BxfA/4IuAaYARwN3A/8+jBjykRSfSXPb2ObE4VVjaTTJf23pJ2SVko6s2TbRyWtltQi6SVJv5+unwL8CDiktIaSfuP/fMnx+9Q60prNn0h6GmiV1JAe9z1JWyW9LOmaLHFHRDtwH3ByyepPALuBKyJiS0S0R8TdwBeAr6Q1icXAHwCXRMR/RERnRLRFxLcj4osD3KMZkv5B0quSdki6P11/uaT/6rNvSDoq/XyHpL+RtFxSK/CnkraUJgxJH0zvB5LqJF0vaV1aI7ovTX5ImiTpn9L1OyU9IWlOlntlY4MThVWFpPnAvwKfJ/lm/Snge5Jmp7u8DrwPOAD4KPBVSadGRCtwLvDqMGool5B8cz8QKAL/AqwkqSGcDVwr6b0ZYp+SnmttyepfA74XEcU+u98HHEpSczgb2BgRj2eMF+BOYDJJLeVgklpKVr9FkqimAV8GWoFf7bP9rvTzNcAHgHcBhwA7gFvSbb8DTAcWAjOBjwPtQ4jDapwTheXh/vSb6M7eb8TAR4DlEbE8IooR8SDQDJwHEBH/GhHrIvFj4N+AX97POG6OiA1pjWApMDsiboyIroh4Cfhb4OIyx39K0k6gBfgl4LKSbbOAzf0cs7lk+8wB9umXpHkkSfHjEbEjIrrTe5HVP0fET9P72wHcTZLgkDSN5F7fne77+8ANEbExIjqBzwEXpc103WnsR0VET0Q8GRFvDiEOq3FOFJaHD0TEgenPB9J1hwG/WZJAdpIUvvMAJJ0raUXacbyTpFCbtZ9xbCj5fBhJ81Xp9f8MKNek8uWIOBBYRPKN+piSbW/0xt7HvJLt2wbYZyALge0RsWMIx5Ta0Gf5LuBCSROBC4H/iYj16bbDgB+U3IvVQA/J/bgTeAC4J20Cu0nShGHGZDXIicKqZQNwZ0kCOTAipkTEF9OC7HskzSVz0sJ5OaD02P6GPG4laaLpNbeffUqP2wC83Of60yLivMECj4ifk3RIf01SU7r634HfkNT3b+pD6bXWAA8BCyQtGewaJTHOkHRgP9v2+X0lDfb7EhHPAetJaimlzU691zq3z/2YFBGb0prMX0TEccA7SZoEfzvj72BjgBOFVcs/Ae+X9F5J9WmH6ZmSFgCNwERgK1CQdC7wnpJjXwNmSppesu5nwHlp5+9c4NpBrv848Gbawd2UxnCCpKVZgk+byl4FrkxXfZWkP+XvJc1Nf59LgBuAT6dNaC8C3wDuTn/XxnS/iyVd3881NpN03H9D0kGSJkj6lXTzSuB4SSdLmkTSVJTFXST9Eb8CfKdk/a3AFyQdBiBptqQL0s9nSTox7Qh/k6Qpqifj9WwMcKKwqoiIDcAFJM09W0m+0X4aqIuIFpLC7D6STtXfApaVHPs8Sdv6S2lTySEkzSMrgVdI+jPuHeT6PcD7SZ5cepmkaejvSDpts/oS8MeSJkbENpKms0nAcyTNTNcBl0VEaSzXAF8n6SjeCawDPkjSsd6fy0gK5udJOvivTeNfA9xIUpN5keTR3CzuBs4E/iMi3ihZ/zWSe/xvklqAFcAvptvmAt8lSRKrgR+TJHobJ+SJi8zMrBzXKMzMrCwnCjMzK8uJwszMynKiMDOzsoY7OFrVzJo1KxYtWlTtMMzMasqTTz75RkTMHnzPt6q5RLFo0SKam5urHYaZWU2RtH7wvfrnpiczMyvLicLMzMpyojAzs7KcKMzMrCwnCjMzK8uJwszMyqpYopB0u6TXJT07wHZJulnSWklPSzq1UrGYmdnwVbJGcQdwTpnt5wKL058rgb+pYCxmZjZMFXvhLiIekbSozC4XAN+KZJzzFZIOlDQvnazFrGr6Dr1fbnmgz73Lra2tFIvFPcu9+5T+29nZOeA5sq4baLqALMe2t7dTV1eHpLL7Dfd6WePwufI711BV883s+ew7p+/GdN1bEoWkK0lnEjv00ENzCc7yVSwW6enpoVgs0tnZuc/nYrG456e1tRVJdHd37yngegvgzs5Ouru7qa+v77cA7/tv7+eIoKur6y0x9S04yy3397lYLDJx4kQmTZqEpD3rS/+VREQgiYkTJ5a9XpaYBlpX7ti6ujoaGhpoaBi4OBjonEO5Xrn15c7vc43cuYarmomiv9+k39QXEbcBtwEsWbLEMy3loKuri0KhQFtb257CNCJob28H3vrtuPenra1tT4EOsGvXLhobG/fs19+xAIVCgYaGBurr6+nu7mbKlClMmDABgPr6epqampBEsVhk6tSpNDU1ccABB9DU1LTn27Ak6urq9lyvv8K777+9n+vr68sWlGbjWTX/MjYCC0uWF5DMQWwV0tXVRWtrK7t37yYiKBaLRAS7du1i69atdHR0UF9fT09PMh1yU1MThUKB6dOn7ymEe78lNzQ0vOVbsiQmTJjApEmTmDVrFlOmTKG+vp7Gxsa37Nf7ufffuro66ur8EJ7ZaFTNRLEMuFrSPSRz8+5y/0R5PT09tLe3s337dgqFAsVikY6ODnp6emhra6NQKLB79+59mjR6f3oL/4aGBqZNm0ZTUxMTJkzY0+zwtre9jdmzZ+9ZN9JVVzOrXRVLFJJ6J3GfJWkj8OfABICIuBVYDpwHrAXagI9WKpZa9Prrr9Pe3k5LSwtbtmzZ0+RTV1fH1KlT9zS/NDQ00NjYyMyZM2lsbKSxsXGfNvHSn/r6+ir/VmZWiyr51NMlg2wP4A8qdf1asmPHDjZs2EBLSwsdHR20tbUBMGfOHKZOncrJJ5/MAQccsKft3cwsT+69y1lXVxednZ1s2rSJN998k+3bt9Pd3c2CBQtYsGDBnmah3lqBmVm1OVHk4M0332TlypXs3LkTgMbGRqZMmcLcuXM58sgjmTlzZpUjNDMbmBNFhfT09PDyyy/zyiuv0N7ezpFHHsnSpUuZOHGiawpmVlOcKEZAsVjktddeY8eOHWzevHlPH8OkSZM48cQTmTNnjpODmdUsJ4r91NLSwiOPPEJEMG/ePObNm8ehhx7KlClTnBzMbExwohimjo4OXnzxRV555RWOOOIIjj32WL8wZmZjkhPFEEUE69atY/Xq1cycOZMzzjiDGTNmVDssM7OKcaIYgojgySefZPv27SxdupS5c+dWOyQzs4pzosioUCjQ3NxMS0sLZ5111p4B68zMxjonikH09PSwdu1a1qxZw/Tp0znzzDOdJMxsXHGiKKOtrY3HHnsMgNNPP53Zs2dXOSIzs/w5UQygo6ODhx56iEMOOYRTTjnFTzSZ2bjlRNGPHTt28NRTTzF//nxOOeUUvw9hZuOaE0Ufmzdvprm5mUWLFnHCCSc4SZjZuOdEUWLr1q00NzezZMkS5s2bV+1wzMxGBTe8pyKCVatWcdJJJzlJmJmVcI2CZFC/Rx99lIaGBhYuXDj4AWZm44hrFMCjjz4KwDvf+U73SZiZ9THuE8WLL75Ie3s773jHO/wIrJlZP8Z1ybhhwwbWrFnD0qVLnSTMzAYwbkvHQqHA888/z9KlS5k+fXq1wzEzG7XGbaJYt24dU6ZM4eCDD652KGZmo9q4TBSdnZ2sXbuWY489ttqhmJmNeuMyUaxZs4b58+dz0EEHVTsUM7NRb9wlil27drF+/XqOOeaYaodiZlYTxl2iWL16NYsXL6apqanaoZiZ1YRxlSjWr1/Pjh07OOqoo6odiplZzRg3iaKrq4vVq1ezZMkS6uvrqx2OmVnNGDeJYtWqVcyZM8ez1JmZDdG4SBQdHR28+uqrHH/88dUOxcys5oyLRLFp0ybmzp1LY2NjtUMxM6s54yJRbN68mblz51Y7DDOzmlTRRCHpHEkvSFor6fp+th8q6WFJT0l6WtJ5Ix3D7t272blzJ3PmzBnpU5uZjQsVSxSS6oFbgHOB44BLJB3XZ7fPAPdFxCnAxcA3RjqOTZs2cdhhh9HQ4DmazMyGo5I1itOAtRHxUkR0AfcAF/TZJ4AD0s/TgVdHOogtW7a42cnMbD9UMlHMBzaULG9M15X6HPARSRuB5cAf9nciSVdKapbUvHXr1swBdHR00NbWxqxZs4YUuJmZ7VXJRNHfnKLRZ/kS4I6IWACcB9wp6S0xRcRtEbEkIpYM5T2IrVu3MmvWLE9vama2HyqZKDYCC0uWF/DWpqUrgPsAIuJRYBIwYl//N2/e7E5sM7P9VMlE8QSwWNLhkhpJOquX9dnn58DZAJKOJUkU2duWyujq6mLbtm3unzAz208VSxQRUQCuBh4AVpM83bRK0o2Szk93+yTwMUkrgbuByyOib/PUsGzZsoWDDjrIL9mZme2nij4zGhHLSTqpS9d9tuTzc8AZlbj2li1bPM2pmdkIGJNvZhcKBd544w3mzZtX7VDMzGremEwU69ev58ADD/TkRGZmI2BMJopt27axaNGiaodhZjYmjLlEUSwW2bZtGwceeGC1QzEzGxPGXKLYtWsXEydOZPLkydUOxcxsTBhziWLDhg3uxDYzG0FjLlH4aSczs5GVKVFIapR0VKWD2V/t7e10dXUxffr0aodiZjZmDJooJP068AzwYLp8sqQfVDqw4WhpaWH69OkeBNDMbARlqVHcCPwisBMgIn4GjMraxfbt25k2bVq1wzAzG1OyJIruiNjZZ92IjMc00jZt2uRBAM3MRliWsZ5WS/oQUCfpcOCPgBWVDWvoenp66OjoYObMmdUOxcxsTMlSo7ga+AWgCHwf6CBJFqPK7t27mTx5svsnzMxGWJYaxXsj4k+AP+ldIelCkqQxauzatctvY5uZVUCWGsVn+ll3w0gHsr/a29v9NraZWQUMWKOQ9F7gHGC+pL8q2XQASTPUqNLR0eEahZlZBZRrenodeJakT2JVyfoW4PpKBjUcra2tHHLIIdUOw8xszBkwUUTEU8BTkr4dER05xjQsnZ2dTJo0qdphmJmNOVk6s+dL+gJwHLCnJI6IoysW1TDs3r2biRMnVjsMM7MxJ0tn9h3APwACzgXuA+6pYExD1tbWBkBjY2OVIzEzG3uyJIrJEfEAQESsi4jPAGdVNqyhaW9vZ8aMGdUOw8xsTMrS9NSp5C22dZI+DmwCDq5sWEPT2dnpZiczswrJkig+AUwFrgG+AEwHfreSQQ2VE4WZWeUMmigi4rH0YwtwGYCkBZUMaqhaWlqYOnVqtcMwMxuTyvZRSFoq6QOSZqXLx0v6FqNsUMDW1lYnCjOzChkwUUj638C3gUuB/yfpBuBhYCUwqh6N7ejooKmpqdphmJmNSeWani4AToqIdkkzgFfT5RfyCS27jo4Ov2xnZlYh5ZqeOiKiHSAitgPPj8Yk0dXVBcCECROqHImZ2dhUrkZxhKTeocQFLCpZJiIurGhkGbnZycysssolit/os/z1SgYyXG1tbU4UZmYVVG5QwIfyDGS4PBigmVllZRnCY1TbvXu3E4WZWQVVNFFIOkfSC5LWSup3DgtJH5L0nKRVku4a6jUKhQINDVleMDczs+HIXMJKmhgRnUPYvx64Bfg1YCPwhKRlEfFcyT6LgT8FzoiIHZKGPIZUa2srs2bNGuphZmaW0aA1CkmnSXoGeDFdPknSX2c492nA2oh4KSK6SIYmv6DPPh8DbomIHQAR8fqQoiepUUyZMmWoh5mZWUZZmp5uBt4HbAOIiJVkG2Z8PrChZHljuq7U0cDRkn4qaYWkczKcdx/d3d1+h8LMrIKyND3VRcT6ZKTxPXoyHKd+1kU/118MnAksAH4i6YSI2LnPiaQrgSsBDj300H1O4JFjzcwqK0uNYoOk04CQVC/pWmBNhuM2AgtLlheQDAPSd59/jojuiHgZeIEkcewjIm6LiCURsWT27Nl71nd3dwO4M9vMrIKyJIqrgOuAQ4HXgNPTdYN5Algs6XBJjcDFwLI++9xP2oyVjlB7NPBSttCTme3q6mr+CV8zs1Ety1fxQkRcPNQTR0RB0tXAA0A9cHtErJJ0I9AcEcvSbe+R9BxJc9anI2Jb1msUCgWmTZs21NDMzGwIsiSKJyS9ANwLfD8iWrKePCKWA8v7rPtsyecgqa1cl/WcpTo6Otw/YWZWYYO220TEkcDngV8AnpF0v6Qh1zAqwYnCzKzyMjXwR8R/R8Q1wKnAmyQTGlVde3s7kydPrnYYZmZjWpYX7qZKulTSvwCPA1uBd1Y8sgxaWlpobGysdhhmZmNalj6KZ4F/AW6KiJ9UOJ4h6enp8RDjZmYVliVRHBERxYpHMgyFQsFvZZuZVdiAiULSVyLik8D3JPV9o3pUzHDX3d3tpiczsworV6O4N/13VM5sB8nwHU4UZmaVVW6Gu8fTj8dGxD7JIn2Rrqoz4BUKBYrFIvX19dUMw8xszMvyeOzv9rPuipEOZKi6urpcmzAzy0G5PooPk4zPdLik75dsmgbs7P+o/HR3d3swQDOzHJQraR8nmYNiAclMdb1agKcqGVQWHR0dnivbzCwH5fooXgZeBv49v3CyiwiPHGtmloNyTU8/joh3SdrBvhMOiWQ8vxkVj66MQqHgcZ7MzHJQrumpd7rTWXkEMlSFQsF9FGZmORiw7abkbeyFQH1E9ADvAH4fmJJDbGV1dHT4rWwzsxxkaeS/n2Qa1COBbwHHAndVNKoMIoI+83ibmVkFZEkUxYjoBi4E/m9E/CEwv7JhDc59FGZm+ciSKAqSfhO4DPhhuq7qbT6dnZ3VDsHMbFzI+mb2WSTDjL8k6XDg7sqGlY1rFGZmlTfoY0MR8ayka4CjJL0NWBsRX6h8aOX5qSczs3wMWtJK+mXgTmATyTsUcyVdFhE/rXRw5fT09DhRmJnlIEtJ+1XgvIh4DkDSsSSJY0klAxtMoVDwyLFmZjnI0kfR2JskACJiNVD1YVs9eqyZWT6y1Cj+R9I3SWoRAJcyCgYF7O7u9gt3ZmY5yJIoPg5cA/wxSR/FI8BfVzKowRSLRYrFovsozMxyULaklXQicCTwg4i4KZ+QBtfT0+P+CTOznAzYRyHpz0iG77gUeFBSfzPdVUV3dzfFYnHwHc3MbL+Vq1FcCrw9IlolzQaWA7fnE1Z5Hr7DzCw/5Z566oyIVoCI2DrIvrkqFAruyDYzy0m5GsURJXNlCziydO7siLiwopGV0dPT45FjzcxyUi5R/Eaf5a9XMpCh6OzspKmpqdphmJmNC+XmzH4oz0CGIiIG38nMzEbEqOl3GIpisejObDOznFQ0UUg6R9ILktZKur7MfhdJCkmZxo/q6emhrq4mc5yZWc3JXNpKGtJXeEn1wC3AucBxwCWSjutnv2kkb34/lvXcxWLRL9yZmeVk0EQh6TRJzwAvpssnScoyhMdpJHNXvBQRXcA9wAX97PeXwE1AR9agPXKsmVl+stQobgbeB2wDiIiVJDPeDWY+sKFkeSN95tqWdAqwMCJ+SBmSrpTULKl569atrlGYmeUoS6Koi4j1fdb1ZDiuvxcd9jyuJKmOZK6LTw52ooi4LSKWRMSS2bNnu4/CzCxHWUrbDZJOA0JSvaRrgTUZjtsILCxZXgC8WrI8DTgB+E9JrwCnA8uydGh7djszs/xkSRRXAdcBhwKvkRToV2U47glgsaTDJTUCFwPLejdGxK6ImBURiyJiEbACOD8imgc7sUePNTPLz6BfyyPidZJCfkgioiDpauABoB64PSJWSboRaI6IZeXPMDAnCjOz/AyaKCT9LSV9C70i4srBjo2I5SSjzpau++wA+5452Pl6OVGYmeUnS0P/v5d8ngR8kH2fZsqdE4WZWX6yND3dW7os6U7gwYpFlEFnZ6efejIzy8lwStvDgcNGOpCh8HsUZmb5ydJHsYO9fRR1wHZgwHGb8uJEYWaWj7KJQsnsQCcBm9JVxRgFY3y7j8LMLD9lm57SpPCDiOhJf6qeJMCJwswsT1n6KB6XdGrFIxmCYrHozmwzs5wM2PQkqSEiCsAvAR+TtA5oJRnDKSKiaslDkufMNjPLSbk+iseBU4EP5BRLZk4SZmb5KZcoBBAR63KKJZOIcLOTmVmOyiWK2ZKuG2hjRPxVBeLJxCPHmpnlp1yJWw9Mpf95JaomIvzEk5lZjsolis0RcWNukWQUEfT0ZJk3yczMRkK5xv5RVZMoNWnSpGqHYGY2bpRLFGfnFsUQuDPbzCxfA5a4EbE9z0CGwonCzCw/NVfiRoTfozAzy1HNJQpIkoWZmeWjJhNFY2NjtUMwMxs3ai5RuDPbzCxfNVniOlGYmeWn5krciKBYLFY7DDOzcaPmEgXAhAkTqh2Cmdm4UZOJwk1PZmb5qckS14nCzCw/NVfi+qknM7N81VyJ685sM7N81VyiAL9wZ2aWp5pMFG56MjPLT82VuO6jMDPLV02WuO6jMDPLT00mCr9wZ2aWn4omCknnSHpB0lpJ1/ez/TpJz0l6WtJDkg7Lcl43PZmZ5adiJa6keuAW4FzgOOASScf12e0pYElEvB34LnDTYOf1xEVmZvmq5Ffz04C1EfFSRHQB9wAXlO4QEQ9HRFu6uAJYkOXErlGYmeWnkiXufGBDyfLGdN1ArgB+1N8GSVdKapbUvHPnTtcozMxyVMlE0V9p3u8cppI+AiwBvtTf9oi4LSKWRMSS6dOnj2CIZmY2mIYKnnsjsLBkeQHwat+dJL0buAF4V0R0Zjmxm57MzPJTyRL3CWCxpMMlNQIXA8tKd5B0CvBN4PyIeD3rievr60c0UDMzG1jFEkVEFICrgQeA1cB9EbFK0o2Szk93+xIwFfiOpJ9JWjbA6UrP6xqFmVmOKtn0REQsB5b3WffZks/vHs55nSjMzPJTcyVusVj0U09mZjmquUQhyX0UZmY5qrlEAbhGYWaWIycKMzMrq+YShZ96MjPLV02WuK5RmJnlpyYThWsUZmb5qckS1zUKM7P81GSiMDOz/DhRmJlZWU4UZmZWlhOFmZmVVXOJwh3ZZmb5qrlEYWZm+XKiMDOzspwozMysrJpLFO6jMDPLV80lCjMzy5cThZmZleVEYWZmZdVconAfhZlZvmouUZiZWb6cKMzMrKyaSxRuejIzy1fNJQozM8uXE4WZmZXlRGFmZmU5UZiZWVlOFGZmVlbNJQo/9WRmlq+aSxRmZpYvJwozMyvLicLMzMqqaKKQdI6kFyStlXR9P9snSro33f6YpEUZzlmJUM3MbAAVSxSS6oFbgHOB44BLJB3XZ7crgB0RcRTwVeD/VCoeMzMbnkrWKE4D1kbESxHRBdwDXNBnnwuAf0w/fxc4W64ymJmNKg0VPPd8YEPJ8kbgFwfaJyIKknYBM4E3SneSdCVwZbrYKenZikRce2bR516NY74Xe/le7OV7sdcxwz2wkomiv5pBDGMfIuI24DYASc0RsWT/w6t9vhd7+V7s5Xuxl+/FXpKah3tsJZueNgILS5YXAK8OtI+kBmA6sL2CMZmZ2RBVMlE8ASyWdLikRuBiYFmffZYBv5N+vgj4j4h4S43CzMyqp2JNT2mfw9XAA0A9cHtErJJ0I9AcEcuAvwfulLSWpCZxcYZT31apmGuQ78Vevhd7+V7s5Xux17DvhfwF3szMyvGb2WZmVpYThZmZlTVqE0Ulhv+oVRnuxXWSnpP0tKSHJB1WjTjzMNi9KNnvIkkhacw+GpnlXkj6UPp/Y5Wku/KOMS8Z/kYOlfSwpKfSv5PzqhFnpUm6XdLrA71rpsTN6X16WtKpmU4cEaPuh6Tzex1wBNAIrASO67PP/wJuTT9fDNxb7bireC/OAiann68az/ci3W8a8AiwAlhS7bir+P9iMfAUcFC6fHC1467ivbgNuCr9fBzwSrXjrtC9+BXgVODZAbafB/yI5B2204HHspx3tNYoPPzHXoPei4h4OCLa0sUVJO+sjEVZ/l8A/CVwE9CRZ3A5y3IvPgbcEhE7ACLi9ZxjzEuWexHAAenn6bz1na4xISIeofy7aBcA34rECuBASfMGO+9oTRT9Df8xf6B9IqIA9A7/MdZkuRelriD5xjAWDXovJJ0CLIyIH+YZWBVk+X9xNHC0pJ9KWiHpnNyiy1eWe/E54COSNgLLgT/MJ7RRZ6jlCVDZITz2x4gN/zEGZP49JX0EWAK8q6IRVU/ZeyGpjmQU4svzCqiKsvy/aCBpfjqTpJb5E0knRMTOCseWtyz34hLgjoj4iqR3kLy/dUJEFCsf3qgyrHJztNYoPPzHXlnuBZLeDdwAnB8RnTnFlrfB7sU04ATgPyW9QtIGu2yMdmhn/Rv554jojoiXgRdIEsdYk+VeXAHcBxARjwKTSAYMHG8ylSd9jdZE4eE/9hr0XqTNLd8kSRJjtR0aBrkXEbErImZFxKKIWETSX3N+RAx7MLRRLMvfyP0kDzogaRZJU9RLuUaZjyz34ufA2QCSjiVJFFtzjXJ0WAb8dvr00+nArojYPNhBo7LpKSo3/EfNyXgvvgRMBb6T9uf/PCLOr1rQFZLxXowLGe/FA8B7JD0H9ACfjoht1Yu6MjLei08CfyvpEyRNLZePxS+Wku4maWqclfbH/DkwASAibiXpnzkPWAu0AR/NdN4xeK/MzGwEjdamJzMzGyWcKMzMrCwnCjMzK8uJwszMynKiMDOzspwobNSR1CPpZyU/i8rsu2inngcGAAADp0lEQVSgkTKHeM3/TEcfXZkOeXHMMM7xcUm/nX6+XNIhJdv+TtJxIxznE5JOznDMtZIm7++1bfxyorDRqD0iTi75eSWn614aESeRDDb5paEeHBG3RsS30sXLgUNKtv1eRDw3IlHujfMbZIvzWsCJwobNicJqQlpz+Imk/0l/3tnPPsdLejythTwtaXG6/iMl678pqX6Qyz0CHJUee3Y6h8Ez6Vj/E9P1X9TeOUC+nK77nKRPSbqIZMytb6fXbEprAkskXSXpppKYL5f018OM81FKBnST9DeSmpXMPfEX6bprSBLWw5IeTte9R9Kj6X38jqSpg1zHxjknChuNmkqanX6Qrnsd+LWIOBX4MHBzP8d9HPhaRJxMUlBvTIdr+DBwRrq+B7h0kOu/H3hG0iTgDuDDEXEiyUgGV0maAXwQOD4i3g58vvTgiPgu0Ezyzf/kiGgv2fxd4MKS5Q8D9w4zznNIhunodUNELAHeDrxL0tsj4maSsXzOioiz0qE8PgO8O72XzcB1g1zHxrlROYSHjXvtaWFZagLw9bRNvodk3KK+HgVukLQA+H5EvCjpbOAXgCfS4U2aSJJOf74tqR14hWQY6mOAlyNiTbr9H4E/AL5OMtfF30n6VyDzkOYRsVXSS+k4Oy+m1/hpet6hxDmFZLiK0hnKPiTpSpK/63kkE/Q83efY09P1P02v00hy38wG5ERhteITwGvASSQ14bdMShQRd0l6DPh14AFJv0cyrPI/RsSfZrjGpaUDCErqd36TdGyh00gGmbsYuBr41SH8LvcCHwKeB34QEaGk1M4cJ8ksbl8EbgEulHQ48ClgaUTskHQHycB3fQl4MCIuGUK8Ns656clqxXRgczp/wGUk36b3IekI4KW0uWUZSRPMQ8BFkg5O95mh7HOKPw8sknRUunwZ8OO0TX96RCwn6Sju78mjFpJhz/vzfeADJHMk3JuuG1KcEdFN0oR0etpsdQDQCuySNAc4d4BYVgBn9P5OkiZL6q92ZraHE4XVim8AvyNpBUmzU2s/+3wYeFbSz4C3kUz5+BxJgfpvkp4GHiRplhlURHSQjK75HUnPAEXgVpJC94fp+X5MUtvp6w7g1t7O7D7n3QE8BxwWEY+n64YcZ9r38RXgUxGxkmR+7FXA7STNWb1uA34k6eGI2EryRNbd6XVWkNwrswF59FgzMyvLNQozMyvLicLMzMpyojAzs7KcKMzMrCwnCjMzK8uJwszMynKiMDOzsv4/d/9anl9G1NUAAAAASUVORK5CYII=\n",
      "text/plain": [
       "<Figure size 432x288 with 1 Axes>"
      ]
     },
     "metadata": {
      "needs_background": "light"
     },
     "output_type": "display_data"
    },
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAYoAAAEWCAYAAAB42tAoAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDIuMi4zLCBodHRwOi8vbWF0cGxvdGxpYi5vcmcvIxREBQAAIABJREFUeJzt3Xt8VOd95/HPb3RBIHGVBDJIIDDiKoPAwmBwDNj4Alh22+0mdpOmTrNxu2026faym+72tU3S9rXbdNPd7m66iduk7mbbpEn7atYY2/gCGJuLLcQdBDYXgQQYEOYqQCDpt3+cIzEIaTQIjUYjfd+v17yYc5nn/OYMmt88z3Oe55i7IyIi0plIsgMQEZG+TYlCRERiUqIQEZGYlChERCQmJQoREYlJiUJERGJSohAJmdl4M7tsZmld7PdZM3ujt+JKBDN73szei1p2M5uczJik71Ki6MfMrMbMroZffq2PsXdZ5hIzq+upGPsSdz/m7jnu3tzFfn/n7o/3VlwiyaZE0f9VhF9+rY8TyQzGzNJTufzeZIF+8zfaVU1N+q5+859Q7oyZLTCzTWZ23sx2mtmSqG1fMLNqM7tkZofN7NfC9dnAa8DY6BqKmb1kZn8c9fpbah1hzebfm9kuoMHM0sPX/ZOZnTGzI2b2lRixvmRm3zWzN8OY3jGzCVHb3cx+08w+Aj4K100L9//EzA6Y2aej9h9sZt82s6NmdsHM3gvXFYdlpYf7PR++/0thjJ+NWh/dbLPQzCrDsirNbGHUtvVm9kdmtjEs5w0zy4vxXteb2Z+Y2UbgCjDJzIab2ffN7KSZHTezP47+0jWzL0V9XvvMbG64/mtmdihq/c93dtxYzGyUmf2NmZ0ws3Nm9rOOzkPUZzE56nP732b2qpk1AL9vZh+3i/3nw/8XmFkkKuazZvYTMxsVbssys/8brj8fnucx3Xk/0g3urkc/fQA1wLIO1o8DzgIrCH4sPBYu54fbVwL3AgYsJvjCmhtuWwLUtSvvJeCPo5Zv2SeMYwdQBAwOj1kF/CcgE5gEHAae6OR9vARcAh4GBgF/AbwXtd2BN4FRYfnZQC3wBSAdmAvUAzPD/b8DrA/PQxqwMCy3OCwrPSzjIjA1fM09Ua9/vvX44THPAb8cvu65cDk33L4eOARMCWNbD/yXGJ/ZeuAYMDMsLwP4GfC9MKbRwAfAr4X7/0vgODAv/LwmAxOito0Nz/dngAbgnvbvIeocTu4kptXAPwAjw3gWd1RG+3LCz+0CsCiMISs8F49F7f9T4Gvh898CtgCF4efxPeBH4bZfA1YBQ8LP7H5gWLL/xgbKI+kB6JHADzf4gr4MnA8fPwvX/3vgh+32XQP8Sifl/Az4avh8Cd1LFL8atTwfONaujN8H/qaT478E/DhqOQdoBorCZQceidr+GeDddmV8D/jD8AvrKjC7g+MUc2uiOA/8C2Bwu/3aviAJEsQH7bZvBp4Pn68H/iBq228Ar8f4zNYD34xaHgM0RsdAkIzWRX1uX43z/8MO4Jn27yHqHN6WKAgSZAswsoNtt5TRvpzwc/s/7bb/MfCD8PlQguQ1IVyuBh5td+wb4efxq8AmYFay/64G4kNNT/3fz7n7iPDxc+G6CcC/DKvw583sPPAQwR8mZrbczLaEzTbnCWoenTaXxKk26vkEguar6OP/B4IvxS5f7+6XgU8Ifi13Vv78duV/FigI30frL9tOuXsDQcL5deCkma02s2kd7DoWONpu3VGC2kqrj6OeXyFIdITNaa1NeP8hxnvJCGNofS/fI6hZQFBL6/C9mNnnzWxH1OtKufPPsQj4xN3P3eHrWtW2W/574BfMbBDwC8A2d289fxOAf46Kt5rgB8EY4IcESfHHYRPYt8wso5sxyR3qNx1/ckdqCWoUX2q/IfwD/ifg88D/c/cbYZu0hbt0NN1wA0GTQKuCDvaJfl0tcMTdS+4g5qKoGHMImnyiO+bbl/+Ouz/WvhALOoevETSt7Yx1QHdfA6wxs8EEv4T/CvhUu91OEHzBRRsPvB6r7LD8XydIRLdtinpeS1CjyHP3pg72rSV4L7cI+3D+CngU2OzuzWa2g5ufY7xqgVFmNsLdz7fbdsvnbmZdfe64+z4zOwosB36JIHFEH+tX3X1jJ7F8A/iGmRUDrwIHgO/H/1aku1SjGJj+L1BhZk+YWVrYUbjEzAoJ+gwGAWeAJjNbDkRfCnoKyDWz4VHrdgArwk7PAoK25lg+AC5a0ME9OIyh1MzmxXjNCjN7yMwygT8C3nf39r9WW70CTDGzXzazjPAxz8ymu3sL8APgzy3oUE8zswfDBNnGzMaY2dMWdOA3EjThdXTZ7KvhsX7Jgk76zwAzwhjumrufBN4Avm1mw8IO33vNbHG4y18Dv2tm91tgcpgksgm+pM+E7+cLBDWK7hz/NeAvzWxkeC4fDjfvBGaaWZmZZQFfj7PYvwe+QtDn9NOo9d8F/iSMHzPLN7NnwudLzey+sCP8IkGTVMzLmKXnKFEMQOEX7DMEzT1nCH7J/R4QcfdLBH/EPyHolP0l4OWo1+4HfgQcDpsIxhI0C+wk6It4g6DjM9bxm4EKoAw4QtDR/NfA8Bgv+3uCPoZPCDoyPxuj/EsEye1Zgl/8HwN/SpAAAX4X2A1UhuX9Kbf/LUSA3wlf/wlBp/5vdHCss8BT4b5ngX8HPOXu9THey536PEEC30fwmfwjYTOhu/8U+BOC83OJoD9plLvvA75N0F9yCrgP6OyXeld+meCLeT9wmvCHgLt/CHwTeIvgarP3OiugnR8R9GOtbXee/oLg/9obZnaJoGN7fritgOB9XyRoknqH4AeP9AILO41E+iwze4mgc/wPkh2LyECkGoWIiMSkRCEiIjGp6UlERGJSjUJERGJKuXEUeXl5XlxcnOwwRERSSlVVVb2753fntSmXKIqLi9m6dWuywxARSSnhQMduUdOTiIjEpEQhIiIxKVGIiEhMShQiIhKTEoWIiMSkRCEiIjElLFGY2Q/M7LSZ7elku5nZ/zCzg2a2y8L7/IqISN+SyBrFS8CTMbYvB0rCxwvA/05gLCIi0k0JSxTuvoFgHv/OPENwP1139y3ACDO7p6tyGxoaeipEERGJQzL7KMZx6/1067j1PsNtzOwFM9tqZlvPnj3bK8GJiEggmYmio3v3djiVrbu/6O7l7l4+fHism6CJiEhPS2aiqAOKopYLCW47KSIifUgyE8XLwOfDq58WABfCG7mLiEgfkrDZY82s9QbqeWZWB/whkAHg7t8FXgVWAAeBK8AXEhWLiIh0X8IShbs/18V2B34zUccXEZGeoZHZIiISkxKFiIjEpEQhIiIxKVGIiEhMShQiIhKTEoWIiMSkRCEiIjEpUYiISExKFCIiEpMShYiIxKREISIiMSlRiIhITEoUIiISkxKFiIjEpEQhIiIxKVGIiEhMShQiIhKTEoWIiMSkRCEiIjEpUYiISExKFCIiEpMShYiIxKREISIiMSlRiIhITEoUIiISkxKFiIjEpEQhIiIxKVGIiEhMShQiIhKTEoWIiMSkRCEiIjEpUYiISEwJTRRm9qSZHTCzg2b2tQ62jzezdWa23cx2mdmKRMYjIiJ3LmGJwszSgO8Ay4EZwHNmNqPdbn8A/MTd5wDPAn+ZqHhERKR7ElmjeAA46O6H3f068GPgmXb7ODAsfD4cOBFPwZcvX+6xIEVEJLZEJopxQG3Ucl24LtrXgc+ZWR3wKvBvOirIzF4ws61mtvXChQusX78+AeGKiEhHEpkorIN13m75OeAldy8EVgA/NLPbYnL3F9293N3Lhw8fjnv7YkREJFESmSjqgKKo5UJub1r6IvATAHffDGQBeQmMSURE7lAiE0UlUGJmE80sk6Cz+uV2+xwDHgUws+kEieJMAmMSEZE7lLBE4e5NwJeBNUA1wdVNe83sm2b2dLjb7wBfMrOdwI+A513tSiIifUp6Igt391cJOqmj1/2nqOf7gEWJjEFERO6ORmaLiEhMKZ8oGhsbdbmsiEgCpXyiOHfuHJcuXUp2GCIi/VbKJ4rWJNHY2Mj27ds1xkJEpIcltDM70d59913Onz8PwO7duzl58iRmRllZWZIjExHpP1K2RnHt2rW2JAFw8uRJAGpra6mvr09WWCIi/U7KJoo333yz022bN2/m2rVrXZZRV1fHhQsXblnn7rS0tNx1fCIi/UVKNz3FUlNTQ0lJCWlpaTQ1NeHuZGRktG1fvXo1LS0t5ObmsnDhQgCuXr3KW2+9BUBeXh4zZ87k3LlzjB8/HrOOpq4SEen/+k2iKCgo4P7772f16tUAfPTRR3z00UcsWrSIjRs3ArBy5UoikQjnzp1rqzWcPXuWt956i6tXr95SXn19Pe+88w4AY8aMISsrqxffjYhI35GyTU/tjRgxgkgkwrJly25Z35okAM6cCaaR2r59+y37RCeJcePaz4QOFy9e7MlQRURSSr+pUQwdOhSAwYMHd7pPZmYmly9fpqGhgfLycoYPH87bb7/dtr2iogKAuXPnAkF/xSuvvIK7U19fz8iRI0lLS0vguxAR6Xv6TaIYOXJk2/PFixezf/9+Tp06BUBxcXFbp/W6desAuOeee4CbyaEjrf0SH3zwAQA5OTksXbq054MXEenDUjpRDBs2jLlz57bVJqLXP/DAA6xatQqA++67j1WrVvHee+8BMHHixG4d705vwXr8+HGGDBnSdtxWsZKTiEhfk9KJYuzYsbclifZaaw7RSktL4z7GvHnzqKysbFs+dOgQzc3N1NbW8uijj3b4mhs3bvD66693WubWrVsZMWIE1dXVt6xfuHAhubm5cccmItIbUjpRpKfHDn/p0qUMGTLklnWFhYV3dIyCggIqKipobGzkjTfeYN++fR3ut2fPHvLy8m5JKqWlpRw4cIAnn3yybd2qVas4efJk2wDBaJs2bWqrbTQ3N2NmRCL95noDEUlRlmpzI5WUlPif//mfA1BeXt5hjaEj169f5/r16+Tk5HT72A0NDaxdu/aWdU8++WSHtYcnnniCzMzM29a3tLRQV1dHYWHhLUlg165dHD169JZ9Bw0axJIlS8jIyNA4DhG5K2ZW5e7l3XltStco7uQKpMzMzA6/uO9EdnZ22y/+1v6P1iQxfPhwLly4wIIFC8jPz++0jEgkwvjx4zuMr73GxkbWrFkDQFFRkeawEpGkSOlEkZeXl7RjV1RUtCWLsrIyioqK7qq8adOmMW3atLbla9eu8eabbzJjxgz27dtHbW0ttbW1LF++vK3J7caNG6Snp6u2ISIJldJNTwPp6qHq6moOHjwIQFZWVttcVtnZ2cyfP5+srCzS0tK4cuUKkUiEmpoaBg8eTENDA1OnTtX4D5EBbkA2PcUaWNcfTZ8+naysLPbs2cO1a9faBgVu27bttn6TVjk5OVy+fJlDhw6xZMmSLq8QExHpSMomioHY3DJx4sTbxoBkZmZiZtTX17fNbTV48OC2RNraPLZ+/XrmzJnDyJEjuXjxImbGmDFjBuR5FJE7k7KJoqtLYweK1o7zvLy8W/o4WlVUVNDS0sLq1atvm+MKgvO4bNmyW2bWbWlp4dq1a2RlZdHc3HzLNhEZeFLu27b1l3Lr1ODStUgkQkVFBVeuXCEjI4NIJMKuXbuoq6ujqakp5uBACMaezJo1S+M6RAaolEsUrTUJ/cq9c9GDD+fMmcOcOXP45JNP2mbYNTNGjBhBSUkJI0eOpLm5mbfeeou6ujrq6upuKWfSpEmMHz+eI0eOUF1dzezZszu87FdEUl/KXfU0d+5c/8Y3vjGgrnhKtmvXrnHp0iW2bNnS5b55eXk8+OCDvRCViNyJAXXVU6oltv4gKyuLrKysW5JzfX09Fy9eZOLEiW2d6Zs3b6a+vp5Vq1b1yNgSEekblCikW/Ly8m4Z8JiXl0dFRUXbQMEdO3awa9cusrKyWLx4MefOnSMtLY2MjAyGDh2Ku+uKK5EUkXKJQvq2rKwsVq5cyZo1azAzrly5wmuvvdbp/uXl5eTn5xOJRNRRLtJHKVFIj4tEIixfvhyAc+fO0dTURH5+Pi0tLZw9e5ZIJEJjYyNVVVVs3br1lteOHz+e++67T0lDpA9RopCEir7zYCQSuWXCxLFjx9Lc3EwkEuHy5ctUV1dz7Ngxjh07BgR3JpwyZQqDBg3q9bhF5KaUSxT6pdm/tM5BNXToUB544AFu3LjBu+++y9WrV6mpqaGmpgYIbiBVUFCQxEhFBq64L481s3HABKKSi7tvSFBcnSovL/f2zRXSf12/fp0NGzZw9epVAEaPHk1RURFjx45NcmQiqSXhl8ea2Z8CnwH2Ac3hagdiJgozexL4CyAN+Gt3/y8d7PNp4OtheTvd/ZfiDV76v8zMTJYtW8apU6eoqanh7NmznD59mqqqKiC4P/rUqVPJzc3VIEyRBIm36enngKnu3hhvwWaWBnwHeAyoAyrN7GV33xe1Twnw+8Aidz9nZqPjD10GkjFjxjBmzBgguNPgpUuX2L9/PxcvXmy7/Wx2djbjxo1j0qRJuk+HSA+KN1EcBjKAuBMF8ABw0N0PA5jZj4FnCGolrb4EfMfdzwG4++k7KF8GqOzsbLKzs2/ps9i9ezdmxtGjR/nwww/b1g8aNIjm5maamppYsWKF7ssh0g3xJoorwA4ze5uoZOHuX4nxmnFAbdRyHTC/3T5TAMxsI0Hz1NfdPfYMdSIduO+++wAoLS1tSwwHDhxg6NCh1NfX8/HHH/Pee++xePHiJEcqknriTRQvh4870VG9v33PeTpQAiwBCoF3zazU3c/fUpDZC8ALgCaeky6lpaWRlpbGrFmzgOA+Hvv27ePQoUNt9+dotWDBAkaNGqWahkgMcSUKd/9bM8skrAEAB9z9RhcvqwOiJ/spBE50sM+WsKwjZnaAIHFUtjv+i8CLEFz1FE/MItFmzJjBqFGjaGxsJC8vj+PHj3PgwAG2bNnCsGHDmDZtGjk5OWRnZyc7VJE+J96rnpYAfwvUENQUiszsV7q4PLYSKDGzicBx4Fmg/RVNPwOeA14yszyCRHT4Tt6ASLyi+zSmTJnClClTqKqq4tSpU3zwwQdt2/Ly8pg1a5aShkgo3qanbwOPu/sBADObAvwIuL+zF7h7k5l9GVhD0P/wA3ffa2bfBLa6+8vhtsfNrPWy299z97Pdfzsid+b++2/+F25qaqK+vp79+/ezdu1aCgoKKC8v19VTMuDFNeDOzHa5+6yu1vUGDbiT3nDs2DF27tyJmTFz5kwKCgra7q4okop6434UW83s+8APw+XPAlXdOaBIKhg/fjx5eXns2rWLPXv2sGfPHoYMGUJhYSGFhYVqlpIBJd4axSDgN4GHCPooNgB/eScD8HqKahTS25qbm9m9eze1tbVkZmZy/fp1ILgtb1NTE+PGjSM/P183apI+7W5qFCl3K1QlCkm206dPc/r0aa5cucKpU6du2WZmuDsLFy4kNzc3SRGK3C5hTU9m9hN3/7SZ7eb2MRAko49CJNlGjx7N6NG3zjbT0tJCU1MTp06dYseOHWzatIlhw4aRl5fHkCFDGDlyJCNGjEhSxCJ3p6s+iq+G/z6V6EBEUlkkEiEzM5OioiKKioo4fPgwe/fu5eLFi7fsV1hYyJw5c5IUpUj3xEwU7n4yfFoPXHX3lvDS2GlA5/e3FBngJk2axKRJk9qWm5qaqKqqoq6ujrq6OvLz85k7dy6ZmZlJjFIkPvF2ZlcBnwJGAluArcAVd/9sYsO7nfooJJWdO3eOy5cvs2PHjrZ106dPZ+LEiZpGRBKqNy6PNXe/YmZfBP6nu3/LzLZ354AiA9nIkSMZOXIkRUVFNDU1sX//fqqrqzlw4ABlZWXk5uaSlZWV7DBFbhF3ojCzBwnGT3zxDl8rIh1IT0+ntLSUqVOnUllZybZt24BgGvUZM2YwZswYjQqXPiHeL/vfIrjB0D+H03BMAtYlLiyRgSMjI4OFCxcCcObMGbZs2dJ2M6a0tDQ+9alPMXTo0GSGKAOcxlGI9EHuTmNjIxs2bKCxsZH58+ffdkmuyJ1I5DiK/+7uv2Vmq+h4HMXT3TmoiMRmZmRlZfH444+zceNG3n//fUaNGsWiRYuSHZoMQF01PbXO7fRfEx2IiHRs0aJFnD59mvfff59Vq1Yxbdo07r33XiKRSLJDkwEi3stjswnHUYTLacAgd7+S4Phuo6YnGaguXbrEhx9+yIkTwf2/hg0bxvjx4ykuLlant3Qp4XM9mdkWYJm7Xw6Xc4A33H1hdw56N5QoZKC7ceMGp0+f5ujRo5w9G9y+paysjMLCQiUM6VRvjKPIak0SAO5+2cyGdOeAInJ3MjIyGDduHOPGjaOxsZHdu3ezY8cOduzYQXFxMVOmTGHQoEHJDlP6kXgTRYOZzXX3bQBmdj9wNXFhiUg8Bg0aRHl58CPxzTffpKamhpqaGgCysrKYO3euZrGVu3Yn4yh+amYnwuV7gM8kJiQR6Y7HHnsMCGayvXjxIpWVlWzatImSkhJGjBhBbm4uGRkZSY5SUlHc4yjMLAOYSnDjov3ufiORgXVGfRQi8Vu7di0ADQ0NQDCvVGFhoaYJGYAS3kcR9kf8NjDB3b9kZiVmNtXdX+nOQUWkdzzyyCNAMIBv37591NTUUF1dzfDhwykvL2fIEHU1StfibXr6G4J7ZD8YLtcBPwWUKERSgJkxc+ZMZs6cyZkzZzh48CBvv/02ANOmTWP8+PHqAJdOxZso7nX3z5jZcwDuftV0HZ5ISsrPzyc/P5+mpiYqKyvZv38/+/fvZ/r06dx77726xFZuE2+iuG5mgwmn8TCze4HGhEUlIgmXnp7Ogw8+yI0bN/joo4+orq6murqa/Px8xo0bR1FRUbJDlD4i3kTxh8DrQJGZ/R2wCHg+UUGJSO/JyMhgxowZTJs2jUOHDrXdWGnHjh1MnDiR0tLSZIcoSdZlogibmPYDvwAsILjq6avuXp/g2ESkF0UiEUpKSgCYPXs2b775JkeOHGnr35CBK+5bobr7/b0QT5d0eaxI79m5cyfHjh0DIDMzk9zcXIYMGcLUqVN169YU0xtTeGwxs3nuXtmdg4hIapo9ezaTJ0+mpqaGzMxMamtrOXnyJIcOHWLMmDE88MADyQ5RekG8NYp9BIPtaoAGguYnd/dZCY2uA6pRiCSXu3P8+HG2b9/OkCFDmDdvHllZWWRkZOiKqT6sN2oUy7tTuIj0P2ZGYWEhQ4cOZfv27WzYsIHWH5wjR45k+vTpml+qn+nqDndZwK8Dk4HdwPfdvak3AhORvm348OEsWbKkbbmhoYGqqio2bdoEQE5ODsXFxbpfRj/QVY3ib4EbwLsEtYoZwFcTHZSIpJ7s7GwefvhhmpubOXnyJPX19ezZs4c9e/awcOFC1TJSWMw+CjPb7e73hc/TgQ/cfW5vBdcR9VGIpI7m5mZeffVVACZMmEBhYSGjRo1KclQDUyL7KNpmiHX3JlUfReROpKWlUVFRwcGDB6murubo0aMAPPHEE2RmZiY5OolXV3dnn21mF8PHJWBW63Mzu9hV4Wb2pJkdMLODZva1GPv9opm5mXUr24lI3zZ58mQqKiqYN28eAHv37k1yRHInYiYKd09z92HhY6i7p0c9HxbrtWaWBnyHm30bz5nZjA72Gwp8BXi/+29DRFJBQUEBJSUl1NXVtd3vW/q+rmoUd+MB4KC7H3b368CPgWc62O+PgG8B1xIYi4j0EdOmTSMvL49NmzaxatUqjh07RnNzc7LDkhgSmSjGAbVRy3XhujZmNgco6uoGSGb2gpltNbOtZ86c6flIRaRXPfjggyxbtgwzY+fOnbz66qtcvnw52WFJJxKZKDrq+W67xMrMIsB/A36nq4Lc/UV3L3f38vz8/B4MUUSSZfDgwTz11FOsXLmSrKws1q1bx8GDB5MdlnQgkYmiDoie0L4QOBG1PBQoBdabWQ3BzLQvq0NbZGCJRCIsW7aMiRMnUl1dzbZt25IdkrSTyERRCZSY2UQzywSeBV5u3ejuF9w9z92L3b0Y2AI87e4aJCEywJgZpaWlPPTQQxw/fhyNlepbEpYowqk+vgysAaqBn7j7XjP7ppk9najjikjqGjlyJGVlZZw8eZLt27er36KPiGv22L5EI7NF+r+TJ0+yb98+rly5wty5cxk3blzXL5KY7mZkdiKbnkREuuWee+5h6dKljBo1im3btvHaa69x6NChZIc1YMU7zbiISK+KRCIsWrSIGzdu8NFHH7Fv3z6am5uZMmVKskMbcJQoRKRPy8jIYMaMGWRkZLB//37S09OZNGlSssMaUJQoRCQllJSUcPXq1bZ5opQseo8ShYikjFmzZpGens7evXupqamhrKxM05b3AnVmi0hKmTFjBo899hhZWVls3LiRvXv3cv369WSH1a8pUYhIysnKymLhwoXMmjWLw4cPs2bNGhoaGpIdVr+lRCEiKWvChAmsWLGCrKws1q5dy86dO2lpaUl2WP2O+ihEJKWlpaXx2GOPUVNTw+7duzl27BgFBQWUl5eju3L2DCUKEekXiouLGT16NMePH2f//v288sor5Ofns2DBgmSHlvI0hYeI9Esff/wxlZWVQJBEpk6dOqDv060pPERE2ikoKKCiooKysjJqa2tZs2YNH3/8cbLDSklKFCLSrxUVFbFixQry8/OprKxk165dupz2DqmPQkQGhAULFrQ1Rx09epShQ4cya9YsDdiLg2oUIjJgtDZHLV26lObmZjZu3EhVVRXNzc3JDq1PU6IQkQEnJyeHRx99lHnz5nHixAleffVVLl26lOyw+iwlChEZsAoKCnjqqacAWL9+vfouOqFEISIDmpnx5JNPArBmzRquXr2a5Ij6HiUKERnwMjIyWLlyJUOGDOGtt97i6NGjyQ6pT1GiEBEhuKPeo48+yqRJk9i1axfHjh1Ldkh9hi6PFRGJMnPmTCKRCDt37uTkyZPMnDmTnJycZIeVVKpRiIi0M336dObNm8fly5dZt24d58+fT3ZISaUahYhIBwoKCigoKGDr1q28++67DBs2jIULF5KRkZHs0HqdahQiIjGUl5fzyCOPcOnSJV5//XUOHjw44O55oUQhItLvCBDOAAALW0lEQVSF7OxsVq5cyeTJk6murmb16tWcOHEi2WH1GiUKEZE4mBnTp0+noqKC3NxcqqqqWL16NfX19ckOLeGUKERE7tDChQt54oknGDZsGJs3b+by5cvJDimhlChERLohMzOThx56CIB169Zx8ODBJEeUOEoUIiLdZGY89dRTjB8/nurqalatWtUvO7qVKERE7oKZMXv2bFauXAnA6tWraWxsTHJUPUuJQkSkB0QiERYvXgzAG2+8wZkzZ5IcUc/RgDsRkR4ybNgwli9fztatW9myZQsQjPKePHlykiO7OwmtUZjZk2Z2wMwOmtnXOtj+22a2z8x2mdnbZjYhkfGIiCRaeno6CxYsoKKigmnTplFdXU1lZWWyw7orCUsUZpYGfAdYDswAnjOzGe122w6Uu/ss4B+BbyUqHhGR3lZSUsL8+fP5+OOP2bBhA+6e7JC6JZE1igeAg+5+2N2vAz8Gnonewd3XufuVcHELUJjAeEREet3o0aN55JFHuHDhAhs2bKChoSHZId2xRCaKcUBt1HJduK4zXwRe62iDmb1gZlvNbGt/6iASkYEhOzubhQsX4u6sXbuWPXv2JDukO5LIRGEdrOuw3mVmnwPKgT/raLu7v+ju5e5enp+f34Mhioj0jtzcXBYvXkxZWRlHjhzhnXfeSZnaRSITRR1QFLVcCNw2i5aZLQP+I/C0u/evi49FRKKYGUVFRSxatIiLFy+ydu3alLhHdyITRSVQYmYTzSwTeBZ4OXoHM5sDfI8gSZxOYCwiIn3GqFGj2gbobd68uc93cicsUbh7E/BlYA1QDfzE3fea2TfN7Olwtz8DcoCfmtkOM3u5k+JERPqVSCTCsmXLaGho4JVXXqG2trbrFyWJ9fVM1l55eblv3bo12WGIiPSI5uZmKisrOXPmDOnp6cyePZuxY8f2+HHMrMrdy7vzWo3MFhFJorS0NBYsWEBLSws7d+6kqqqKK1eu9KnR3JrrSUSkD4hEIsyZM4eysjKqq6t59913kx1SGyUKEZE+pKioiIcffpjz58/z2muvcfjw4WSHpEQhItLXDB8+nOXLl3Pvvfeyd+9edu/endR4lChERPqg9PR0pkyZQmFhITU1Ndy4cSNpsShRiIj0YbNmzQJIaq1CiUJEpA9LS0ujuLiY48eP8+GHHyYlBl0eKyLSx82cOZOMjAwOHDjA2bNnKS8vJyMjo9eOrxqFiEgfF4lEmDZtGvPnz6e+vp533nmnd4/fq0cTEZFuGz16NIsXL+bq1aucOnWq146rRCEikkKGDRvGxIkT+eCDDzh+/HivHFN9FCIiKaa0tJTr16+zbds2GhsbmTRpUkKPp0QhIpKC5s6dy5AhQ9i7dy/Z2dmMGTMmYcdS05OISIqaNm0aI0aM4IMPPuDEidvuC9djlChERFLYQw89xIgRI6iqquL69esJOYYShYhICjMz5s+fD8CePXsScgwlChGRFJeZmcnEiRM5fvw4LS0tPV6+EoWISD9QWloKwJUrV3q8bCUKEZF+ZOPGjT1ephKFiEg/8cgjj3D9+nXee++9Hi1XiUJEpJ/Izs5m6dKlnDt3jiNHjvRYuUoUIiL9SE5ODqNGjaKurq7HylSiEBHpZ2bOnMn58+c5f/58j5SnRCEi0s+MGDGCYcOGsWnTph4pT4lCRKQfKisro7m5uUeuglKiEBHph4YPH87s2bP55JNP7rosJQoRkX6qqKgIgIsXL95VOUoUIiL9lJmRk5PDjh077qocJQoRkX6stLSUa9eu3VUZShQiIv1Ybm4ujY2Nd1WGEoWISD9mZnddhhKFiEg/ZmaMGDHirspQohAR6ec+9alP3dXrE5oozOxJMztgZgfN7GsdbB9kZv8Qbn/fzIoTGY+IiNy5hCUKM0sDvgMsB2YAz5nZjHa7fRE45+6Tgf8G/Gmi4hERke5JZI3iAeCgux929+vAj4Fn2u3zDPC34fN/BB61nuh5ERGRHpOewLLHAbVRy3XA/M72cfcmM7sA5AL10TuZ2QvAC+Fio5kl5g7iqSePdudqANO5uEnn4iadi5umdveFiUwUHdUMvBv74O4vAi8CmNlWdy+/+/BSn87FTToXN+lc3KRzcZOZbe3uaxPZ9FQHFEUtFwInOtvHzNKB4cDdz2AlIiI9JpGJohIoMbOJZpYJPAu83G6fl4FfCZ//IrDW3W+rUYiISPIkrOkp7HP4MrAGSAN+4O57zeybwFZ3fxn4PvBDMztIUJN4No6iX0xUzClI5+ImnYubdC5u0rm4qdvnwvQDXkREYtHIbBERiUmJQkREYuqziULTf9wUx7n4bTPbZ2a7zOxtM5uQjDh7Q1fnImq/XzQzN7N+e2lkPOfCzD4d/t/Ya2Z/39sx9pY4/kbGm9k6M9se/p2sSEaciWZmPzCz052NNbPA/wjP0y4zmxtXwe7e5x4End+HgElAJrATmNFun98Avhs+fxb4h2THncRzsRQYEj7/1wP5XIT7DQU2AFuA8mTHncT/FyXAdmBkuDw62XEn8Vy8CPzr8PkMoCbZcSfoXDwMzAX2dLJ9BfAawRi2BcD78ZTbV2sUmv7jpi7Phbuvc/cr4eIWgjEr/VE8/y8A/gj4FnB3t/Xq2+I5F18CvuPu5wDc/XQvx9hb4jkXDgwLnw/n9jFd/YK7byD2WLRngP/jgS3ACDO7p6ty+2qi6Gj6j3Gd7ePuTUDr9B/9TTznItoXCX4x9EddngszmwMUufsrvRlYEsTz/2IKMMXMNprZFjN7stei613xnIuvA58zszrgVeDf9E5ofc6dfp8AiZ3C42702PQf/UDc79PMPgeUA4sTGlHyxDwXZhYhmIX4+d4KKIni+X+RTtD8tISglvmumZW6+/kEx9bb4jkXzwEvufu3zexBgvFbpe7ekvjw+pRufW/21RqFpv+4KZ5zgZktA/4j8LS7390Ncvuurs7FUKAUWG9mNQRtsC/30w7teP9G/p+733D3I8ABgsTR38RzLr4I/ATA3TcDWQQTBg40cX2ftNdXE4Wm/7ipy3MRNrd8jyBJ9Nd2aOjiXLj7BXfPc/didy8m6K952t27PRlaHxbP38jPCC50wMzyCJqiDvdqlL0jnnNxDHgUwMymEySKM70aZd/wMvD58OqnBcAFdz/Z1Yv6ZNOTJ276j5QT57n4MyAH+GnYn3/M3Z9OWtAJEue5GBDiPBdrgMfNbB/QDPyeu59NXtSJEee5+B3gr8zs3xI0tTzfH39YmtmPCJoa88L+mD8EMgDc/bsE/TMrgIPAFeALcZXbD8+ViIj0oL7a9CQiIn2EEoWIiMSkRCEiIjEpUYiISExKFCIiEpMShUg7ZtZsZjvMbI+ZrTKzET1c/vNm9r/C5183s9/tyfJFepoShcjtrrp7mbuXEozR+c1kBySSTEoUIrFtJmrSNDP7PTOrDOfy/0bU+s+H63aa2Q/DdRXhvVK2m9lbZjYmCfGL3LU+OTJbpC8wszSCaR++Hy4/TjBX0gMEk6u9bGYPA2cJ5tla5O71ZjYqLOI9YIG7u5n9K+DfEYwQFkkpShQitxtsZjuAYqAKeDNc/3j42B4u5xAkjtnAP7p7PYC7t05OWQj8QzjffyZwpFeiF+lhanoSud1Vdy8DJhB8wbf2URjwn8P+izJ3n+zu3w/XdzQXzv8E/pe73wf8GsFEdCIpR4lCpBPufgH4CvC7ZpZBMOncr5pZDoCZjTOz0cDbwKfNLDdc39r0NBw4Hj7/FURSlJqeRGJw9+1mthN41t1/GE5RvTmcpfcy8LlwptI/Ad4xs2aCpqnnCe6q9lMzO04w5fnEZLwHkbul2WNFRCQmNT2JiEhMShQiIhKTEoWIiMSkRCEiIjEpUYiISExKFCIiEpMShYiIxPT/ASrOcmPpIQM5AAAAAElFTkSuQmCC\n",
      "text/plain": [
       "<Figure size 432x288 with 1 Axes>"
      ]
     },
     "metadata": {
      "needs_background": "light"
     },
     "output_type": "display_data"
    }
   ],
   "source": [
    "parse_configs_and_run(configs, lr=0.01)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Help section (FAQs)\n",
    "\n",
    "#### How to select `n_steps` (how online sampling works)\n",
    "\n",
    "If you are using the `IntervalsSampler` (like we do in this example), you have passed in a list of regions from which you'd like to sample. If you are using `RandomPositionsSampler`, you are using the entire genome (minus any blacklist regions, see documentation [here](http://selene.flatironinstitute.org/sequences.html#genome)) for sampling. \n",
    "\n",
    "An `OnlineSampler` will randomly select a position from the regions you'd like to sample and query the reference sequence for a sequence of `sequence_length` bp centered at that position. \n",
    "\n",
    "When selecting `n_steps`, you should consider the number of possible positions from which the online sampler can sample, and how many times you might want to re-sample from that set. \n",
    "\n",
    "#### How to select: `report_stats_every_n_steps`, `n_validation_samples`, and `n_test_samples`\n",
    "\n",
    "We currently report ROC AUC and average precision averaged across all classes (e.g. genomic features) predicted by the model for both the validation and the test sets.\n",
    "\n",
    "`report_stats_every_n_steps` may be renamed in the future--when working with an online sampler, it can be used to report the validation ROC AUC and average precision after every estimated \"epoch\". You can set it to report more frequently than this if you'd like. \n",
    "\n",
    "Both `n_validation_samples` and `n_test_samples` should be larger enough such that we expect to sample positive examples for each feature. In the tutorial example here, we have set `sample_negative` to `True` because our model is only classifying whether a sequence contains 1 genomic feature or not. In a multi-class example, we may not need to do this because a positive example for 1 class might be a negative example for all other classes. In any case, you should estimate the number of samples you might randomly sample from regions in the genome to get to at least 1 positive example. **Otherwise, `roc_auc` and `average_precision` will always be reported as `None`** because all your samples will be labeled with 0 classes/genomic features.\n",
    "\n",
    "Selene's `TrainModel` class by default sets `report_gt_feature_n_positives` to 10. This means you must have at least 10 positive examples for a genomic feature for the ROC AUC and average precision to be computed. \n"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "test-selene-env",
   "language": "python",
   "name": "test-selene-env"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.6"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
